Muxing: a Telephone-access Mandarin Conversational System
Abstract
MUXING is a telephone-based conversational system that gives users access to weather information in Mandarin Chinese. Although MUXING uses the same architecture and most of the same human language technology components as its English predecessor, JUPITER, some modifications to the system were necessary to account for differences between English and Mandarin Chinese. In addition, the weather database had to be modified to reflect regions of greater interest to potential Chinese users. This paper describes our system development effort, paying particular attention to Mandarin-specific changes to the original JUPITER system.
Similar resources
Improving Language Models for Mandarin Conversational Speech Recognition with Web Data
Lack of data is a problem in training language models for conversational speech recognition, particularly for languages other than English. Experiments in English have successfully used web-based text collection targeted for a conversational style to augment small sets of transcribed speech; here we look at extending these techniques to Mandarin. In addition, we investigate different techniques ...
A preliminary study of Mandarin filled pauses
The paper reports preliminary results on Mandarin filled pauses (FPs), based on a large speech corpus of Mandarin telephone conversation. We find that Mandarin intensively uses both demonstratives (zhege ‘this’, nage ‘that’) and uh/mm as FPs. Demonstratives are more frequent FPs and are more likely to be surrounded by other types of disfluency phenomena than uh/mm, as well as occurring more of...
HKUST/MTS: A Very Large Scale Mandarin Telephone Speech Corpus
The paper describes the design, collection, transcription and analysis of 200 hours of HKUST Mandarin Telephone Speech Corpus (HKUST/MTS) from over 2100 Mandarin speakers in mainland China under the DARPA EARS framework. The corpus includes speech data, transcriptions and speaker demographic information. The speech data include 1206 ten-minute natural Mandarin conversations between either stran...
2000 NIST Evaluation of Conversational Speech Recognition over the Telephone: English and Mandarin Performance Results
This paper documents the use of conversational telephone speech test materials in the NIST coordinated evaluation conducted early in 2000. The primary evaluation was of General American English speech, but a subsidiary evaluation of Mandarin speech was also offered. The primary test data consisted of twenty conversations collected for the original Switchboard Corpus but not released with the pu...
Detection of word fragments in Mandarin telephone conversation
We describe preliminary work on the detection of word fragments in Mandarin conversational telephone speech. We extracted prosodic, voice quality, and lexical features, and trained Decision Tree and SVM classifiers. Previous research shows that glottalization features are instrumental in English fragment detection. However, we show that Mandarin fragments are quite different than English; 90% o...
Publication date: 2000